Fix llm_config fallback #4415

Merged: enyst merged 15 commits into main from enyst/config_fix on Jan 15, 2025

Conversation

@enyst (Collaborator) commented Oct 15, 2024

End-user friendly description of the problem this fixes or functionality that this introduces

  • Include this change in the Release Notes. If checked, you must provide an end-user friendly description for your change below
    [cli, headless] Restore fallback to defaults for custom LLM configurations.

Give a summary of what the PR does, explaining any non-trivial design decisions

This PR proposes to fix a regression: when we define custom LLM configurations (i.e. sections with user-defined names like [llm.claude] or [llm.o1-mini]), we normally don't have to repeat all the attributes that are already in the generic [llm] section. If an attribute is not defined in the custom section, it falls back to the [llm] section; and if it's not there either, it falls back to the default values defined in the dataclass (llm_config.py).

This fallback behavior got lost at some point, so now we have to copy/paste all the values we want applied into each custom section. IMHO this is a bug, and it has a pretty bad usability cost. This PR proposes to restore the fallback behavior.

Example use cases: almost every attribute benefits, particularly embedding-model settings, max_message_chars, retry values (which we may or may not want to override per LLM), the draft LLM, saving completions in State for evals debugging, etc.

PR behavior (see the sketch below):

  • in a custom LLM config, any attribute can be specified, but none has to be
  • if an attribute is not defined in the custom section, it falls back to the [llm] section
  • if it's not defined there either, it falls back to the default values in the dataclass
  • if it is overridden in the custom section, then of course the override wins 😅
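
For illustration, here is a minimal sketch of that resolution order in Python. It is not the actual OpenHands implementation; the dataclass defaults and the merge helper below are assumptions made for the example.

from dataclasses import dataclass, fields

@dataclass
class LLMConfig:
    # Field names follow the TOML keys used in this PR; the defaults are illustrative.
    model: str = 'gpt-4o'
    api_key: str | None = None
    embedding_model: str = 'local'
    retry_min_wait: int = 5
    num_retries: int = 8
    max_input_tokens: int | None = None
    max_message_chars: int = 30_000
    log_completions: bool = False

def resolve_llm_config(generic: dict, custom: dict) -> LLMConfig:
    """Resolve a [llm.<name>] section against [llm] and the dataclass defaults."""
    merged = {}
    for f in fields(LLMConfig):
        if f.name in custom:        # 1. an explicit value in the custom section wins
            merged[f.name] = custom[f.name]
        elif f.name in generic:     # 2. otherwise fall back to [llm]
            merged[f.name] = generic[f.name]
        # 3. otherwise the dataclass default applies
    return LLMConfig(**merged)

# Example with values like the config samples below:
generic = {'model': 'gpt-o1-mini', 'api_key': 'sk-', 'embedding_model': 'openai',
           'retry_min_wait': 10, 'num_retries': 3, 'max_message_chars': 100_000,
           'log_completions': True}
custom = {'model': 'anthropic/claude-3-haiku-20240307', 'api_key': 'sk-ant-',
          'max_input_tokens': 8000}
print(resolve_llm_config(generic, custom))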

Current main:

[llm]
model="gpt-o1-mini"
api_key="sk-"
embedding_model="openai"
retry_min_wait=10
num_retries=3
max_message_chars=100_000
log_completions=true

[llm.gpt-4o]
model="gpt-4o"
api_key="sk-"
embedding_model="openai"
retry_min_wait=10
num_retries=3
max_message_chars=100_000
log_completions=true

[llm.gpt35]
model="gpt-3.5-turbo"
api_key="sk-"
embedding_model="openai"
retry_min_wait=10
num_retries=3
max_message_chars=100_000
log_completions=true

[llm.claude-8k]
model="anthropic/claude-3-haiku-20240307"
api_key="sk-ant-"
embedding_model="openai"
retry_min_wait=10
num_retries=3
max_input_tokens=8000
max_message_chars=100_000
log_completions=true

On this PR:

[llm]
model="gpt-o1-mini"
api_key="sk-"
embedding_model="openai"
retry_min_wait=10
num_retries=3
max_message_chars=100_000
log_completions=true

[llm.gpt-4o]
model="gpt-4o"
api_key="sk-"

[llm.gpt35]
model="gpt-3.5-turbo"
api_key="sk-"

[llm.claude-8k]
model="anthropic/claude-3-haiku-20240307"
api_key="sk-ant-"
max_input_tokens=8000
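
To see the effective values, here is a hypothetical snippet that loads such a file (assuming it is saved as config.toml) and resolves the claude-8k section, reusing the resolve_llm_config sketch from earlier. With the fallback restored, claude-8k keeps its own model, api_key and max_input_tokens and inherits everything else from [llm].

import tomllib  # Python 3.11+ standard library

with open('config.toml', 'rb') as f:
    cfg = tomllib.load(f)

# Scalar keys of [llm] are the generic defaults; nested tables are the custom sections.
generic = {k: v for k, v in cfg['llm'].items() if not isinstance(v, dict)}
custom = cfg['llm']['claude-8k']

print(resolve_llm_config(generic, custom))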

Link of any specific issues this addresses
Addresses this: IMHO it should not be mandatory to define the draft_llm in every custom LLM config just to see it applied (or to get a value at all!) when we use custom LLM configs. I think @xingyaoww has addressed this by making it fall back to the regular llm, which is cool of course; I just think the original behavior in the draft PR was due to this bug between generic and custom configs.


To run this PR locally, use the following command:

docker run -it --rm \
  -p 3000:3000 \
  -v /var/run/docker.sock:/var/run/docker.sock \
  --add-host host.docker.internal:host-gateway \
  -e SANDBOX_RUNTIME_CONTAINER_IMAGE=docker.all-hands.dev/all-hands-ai/runtime:3e5f40e-nikolaik \
  --name openhands-app-3e5f40e \
  docker.all-hands.dev/all-hands-ai/openhands:3e5f40e

@enyst requested review from rbren and xingyaoww on October 15, 2024 21:32
@xingyaoww (Collaborator) left a comment


LGTM! Is it possible to add some test cases for this to make sure we don't accidentally ruin it again?

@tobitege (Collaborator) commented:

The [llm] section itself is for a specific model. These fallback scenarios may save typing, but they can end up mixing values from sections that, imo, shouldn't be mixed.
Values missing from a custom llm section should only fall back to the dataclass defaults; otherwise it becomes more complex, not easier, to figure out where a value comes from.
E.g. a max_output_tokens value might be left out of [llm.sonnet] but suddenly pick up a value from the [llm] section, which is just wrong since that section is for a different model. In that case it should fall back to the dataclass default instead.
Not to mention that changing the model in the UI makes it completely unclear which values are taken from the toml file.
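
To make the concern concrete, here is a hypothetical illustration (the section contents are invented for the example) of the two fallback policies being debated:

# [llm] and [llm.sonnet] flattened to dicts; values invented for illustration.
generic = {'model': 'gpt-4o', 'max_output_tokens': 4096}
sonnet = {'model': 'anthropic/claude-3-5-sonnet'}   # no max_output_tokens set

DATACLASS_DEFAULT = None  # assumed dataclass default for max_output_tokens

# Policy in this PR: a missing key falls back to [llm] first.
value_pr = sonnet.get('max_output_tokens', generic.get('max_output_tokens'))  # -> 4096
# Policy argued for above: a missing key falls back straight to the dataclass default.
value_dataclass = sonnet.get('max_output_tokens', DATACLASS_DEFAULT)          # -> None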

@enyst (Collaborator, Author) commented Oct 15, 2024

@tobitege You make good points, and I think there is a solution for the first one: that case happens because you need max_output_tokens for the LLM in [llm], but then you can give that LLM a name, right? e.g. [llm.llama]. Then the attributes override as you wish, if I understand correctly: llama's attributes with llama, claude's with claude.

A model is optional in the [llm] section; we can use it or not at all. As you say, if it helps to choose exactly what you want: then name that group.

The issue with how the UI works with toml is tracked here. Just to note:

Custom LLM configs were developed for the CLI, not the UI. I'm pretty sure they did not interfere with the UI at all until - as you pointed out recently in Slack - the unexpected happened: setting one as the "agent" LLM made it actively used with the UI too. That was, as far as I know, an unintended use case, and frankly I didn't even think it could happen until you mentioned it. I haven't even gotten to test it yet. This issue is not about that.

I think sorting it all out will not be trivial, but if I were to guess, the solutions for the interplay between the UI and toml will probably be based on:

  • having LLM settings (all of them, or almost all) in the UI. That should solve some issues, and, importantly,
  • the suggestion in the linked topic: actually save to the toml file when you override from the UI, and of course read it back in full, not in parts and not unintentionally. That should fix many of these things.

IMHO, the way it works now is mostly workable, except for that case, the agent llm. I'm not sure how we can address that without very significant changes.

@enyst (Collaborator, Author) commented Oct 16, 2024

The LLM section itself is for a specific model.

On further thought, I see your point better. Maybe the solution here doesn't require a lot, just a way to define a "default llm". That means separating these two views of the [llm] section:

  • when we use the CLI with -l, the [llm] section means nothing as an LLM; it's not used as such, so it only serves as generic defaults
  • otherwise, the [llm] section is the LLM: the default LLM that the agent falls back to, and that anything else needing an LLM for completion uses.

@enyst (Collaborator, Author) commented Oct 25, 2024

Actually, I don't fully see how you encounter the first case you mention. I mean, if you run main.py --llm_config sonnet sometimes and main.py without a config other times, you can give the second one a custom name too. Then you will not get fallbacks.
If that's your preferred workflow, it seems like you can keep it; this PR doesn't break that ability, it just restores the ability to switch between 20+ configs, basically. 😅

E.g. a max_output_tokens value might be left out in "[llm.sonnet]" but suddenly get a value from the "[llm]" section which is just wrong as it is for a different model.

Can you please tell me exactly how you were encountering this case? e.g. how were you running it?

@mamoodi (Collaborator) commented Nov 14, 2024

@enyst is this something that will still be pursued?

@tobitege (Collaborator) commented:

I've come around to seeing the benefit of having the [llm] entries serve as defaults for any non-existent key in custom sections.
This should indeed make things easier for the user and shorten the toml setup.
But could we at least add corresponding documentation for this feature in this PR?

@enyst (Collaborator, Author) commented Nov 21, 2024

Will do, Tobi, it's on my todo list, I'll get to it tomorrow!

@enyst self-assigned this Nov 27, 2024
@enyst (Collaborator, Author) commented Dec 14, 2024

@openhands-agent Read this PR diff carefully and understand what it's doing. Read the *_config.py files to understand the config implementation we have. Also read config.template.toml, which is very important because it already attempts to document how the configuration works.

Then, check out docs/modules/usage for Configuration Options to see what we have documented so far. Add another .md file to document the behavior of this particular feature: named (custom) LLM configurations. Make sure it includes an accurate description of the feature and how to use it.

Note: you should include a link to this new page in the Configuration Options page, under LLM Config, and specify that it is ONLY used with the development setup (development.md, you can find that too).

@openhands-agent (Contributor) commented:

OpenHands started fixing the PR! You can monitor the progress here.

@openhands-agent (Contributor) commented:

Overview of Changes:
✓ All issues appear to have been successfully resolved through:

  • Creation of dedicated documentation for custom LLM configs
  • Comprehensive coverage of feature details and usage
  • Proper integration with existing documentation
  • Clear labeling of development-only status

Status: Complete - No remaining issues identified. The implementation addresses all requested documentation needs while maintaining proper structure and transparency about feature limitations.

- Adds proper fallback mechanism from generic LLM config to custom configs
- Adds special handling for draft_editor field:
  - Falls back to generic config value if not specified
  - Can be set to None using 'null' in TOML
  - Can be overridden with custom value
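
A rough sketch of the draft_editor handling described in that commit note; this is not the PR's actual code, and the field access is simplified (TOML has no null literal, hence the 'null' string convention):

def resolve_draft_editor(custom: dict, generic_draft_editor):
    """Illustrative only: resolve draft_editor for a custom [llm.<name>] section."""
    if 'draft_editor' not in custom:
        # Not specified: fall back to the generic [llm] value.
        return generic_draft_editor
    if custom['draft_editor'] == 'null':
        # The string 'null' explicitly disables it (resolves to None).
        return None
    # An explicit custom value overrides everything else.
    return custom['draft_editor']
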
@enyst requested a review from tobitege on December 14, 2024 14:35
@enyst (Collaborator, Author) commented Dec 14, 2024

This is ready for review again, on the same principle as before: consistent fallbacks.

FWIW, I'm changing my mind about a few of the attributes, the auth-relevant ones. I doubt excluding some of them should be done with the current code design, though; it needs a refactoring first IMO.

@enyst requested a review from xingyaoww on December 15, 2024 22:38
@mamoodi (Collaborator) commented Dec 30, 2024

@enyst is it worth it to ping people for review here now?

@rbren (Collaborator) left a comment


Just one question

@enyst enabled auto-merge (squash) on January 15, 2025 00:56
@enyst mentioned this pull request Jan 15, 2025
@enyst merged commit c5797d1 into main on Jan 15, 2025
14 of 15 checks passed
@enyst deleted the enyst/config_fix branch on January 15, 2025 01:17
csmith49 pushed a commit to csmith49/OpenHands that referenced this pull request Jan 19, 2025